CHAPTER 3 Getting Statistical: A Short Review of Basic Statistics 37

Chapter 24 describes these and other distribution functions in more detail, and

you also encounter them throughout this book.

Distributions important to statistical testing

Some probability distributions don’t describe fluctuations in data values but

instead describe fluctuations in calculated values as part of a statistical test (when

you are calculating what’s called a test statistic). Distributions of test statistics

include the Student t, chi-square, and Fisher F distributions. Test statistics are

used to obtain the p values that result from the tests. See “Getting the language

down” later in this chapter for a definition of p values.
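
To make the idea of a test statistic having its own distribution concrete, here is a minimal simulation sketch. It assumes a normally distributed population (the mean of 100, SD of 15, and sample size of 10 are made-up values) and uses hypothetical helper names, `one_sample_t` and `simulate_t_statistics`. It repeatedly draws samples, calculates a one-sample t statistic for each, and checks how often that statistic lands beyond 2.262, the two-sided 5 percent critical value of the Student t distribution with 9 degrees of freedom.

```python
import math
import random
import statistics


def one_sample_t(sample, mu0):
    """One-sample t statistic: (sample mean - mu0) / (SD / sqrt(n))."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)
    return (mean - mu0) / (sd / math.sqrt(n))


def simulate_t_statistics(n_trials=5000, n=10, mu=100.0, sigma=15.0, seed=1):
    """Draw repeated samples from a normal population and collect t values.

    All parameter values are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)
    t_values = []
    for _ in range(n_trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # Test against the true mean, so any nonzero t is pure random fluctuation
        t_values.append(one_sample_t(sample, mu))
    return t_values


t_values = simulate_t_statistics()

# Fraction of simulated t values beyond the 5% two-sided critical value (9 df)
frac_extreme = sum(abs(t) > 2.262 for t in t_values) / len(t_values)

print(f"mean of simulated t values: {statistics.fmean(t_values):.3f}")
print(f"fraction with |t| > 2.262:  {frac_extreme:.3f}")
```

If the simulated statistic really follows the Student t distribution, the printed fraction should come out close to 0.05, which is the same logic a statistical test uses to turn a test statistic into a p value.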

Introducing Statistical Inference

Statistical inference is the process of drawing conclusions (or inferring) about a population

based on estimates from a sample of that population. The challenge posed by

statistical inference theory is to extract real information from the noise in your

data. This noise is made up of random fluctuations in the data as well as measurement

error. This very broad area of statistical theory can be subdivided into two topics:

statistical estimation theory and statistical decision theory.

Statistical estimation theory

Statistical estimation theory focuses on how to improve the accuracy and precision of

metrics calculated from samples. It provides methods to estimate how close

your sample estimates are likely to be to the true population value, and to calculate the range of

values from your sample that’s likely to include the true population value. The

following sections review the fundamentals of statistical estimation theory.
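
As a rough illustration of both ideas, the sketch below calculates a sample mean, its standard error (a measure of how precise the estimate is), and an approximate 95 percent range around the mean. The blood pressure readings are invented for illustration, `mean_with_ci` is a hypothetical helper name, and the range uses a normal approximation; with a sample this small, a multiplier based on the Student t distribution would give a somewhat wider range.

```python
import math
import statistics


def mean_with_ci(sample, confidence=0.95):
    """Return (mean, standard error, (low, high)) using a normal approximation."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    # z multiplier for the requested confidence level (about 1.96 for 95%)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, se, (mean - z * se, mean + z * se)


# Hypothetical systolic blood pressure readings (mmHg), made up for illustration
readings = [118, 125, 131, 122, 127, 135, 120, 129, 124, 126]

mean, se, (low, high) = mean_with_ci(readings)
print(f"mean = {mean:.1f} mmHg, standard error = {se:.2f}")
print(f"approximate 95% range: {low:.1f} to {high:.1f} mmHg")
```

A larger sample shrinks the standard error, which narrows the range that is likely to include the true population value.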

Accuracy and precision

Whenever you make an estimation or measurement, your estimated or measured

value can differ from the truth by being inaccurate, imprecise, or both.

» Accuracy refers to how close your measurement tends to come to the true

value, without being systematically biased in one direction or another. Such a

bias is called a systematic error.

» Precision refers to how close several replicate measurements come to each

other — that is, how reproducible they are.
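
The difference between the two terms can be sketched with a small simulation. The example below assumes two imaginary bathroom scales weighing the same 70 kg object: one reads consistently high but with very little scatter (precise but inaccurate), and the other has no systematic bias but a lot of scatter (accurate but imprecise). All names and numbers are hypothetical.

```python
import random
import statistics


def simulate_measurements(true_value, bias, noise_sd, n=1000, seed=0):
    """Simulate n replicate measurements with a systematic bias and random noise."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]


def summarize(measurements, true_value):
    """Return (estimated bias, spread).

    Estimated bias (mean error) reflects accuracy;
    spread (SD of the replicates) reflects precision.
    """
    bias = statistics.fmean(measurements) - true_value
    spread = statistics.stdev(measurements)
    return bias, spread


TRUE_WEIGHT = 70.0  # hypothetical true value, in kg

# Scale A: precise but inaccurate (reads about 2 kg high, very little scatter)
scale_a = simulate_measurements(TRUE_WEIGHT, bias=2.0, noise_sd=0.1)
# Scale B: accurate but imprecise (no systematic bias, lots of scatter)
scale_b = simulate_measurements(TRUE_WEIGHT, bias=0.0, noise_sd=1.5)

bias_a, spread_a = summarize(scale_a, TRUE_WEIGHT)
bias_b, spread_b = summarize(scale_b, TRUE_WEIGHT)

print(f"Scale A: bias = {bias_a:+.2f} kg, spread (SD) = {spread_a:.2f} kg")
print(f"Scale B: bias = {bias_b:+.2f} kg, spread (SD) = {spread_b:.2f} kg")
```

Averaging more readings from Scale B brings its mean closer to the truth, but no amount of averaging removes Scale A's systematic error.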